AUTOMATED COLLABORATIVE FILTERING IN WORLD WIDE WEB
ADVERTISING

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims priority from U.S. Provisional
Application No. 60/009,286 filed on Dec. 27, 1995 and U.S.
Provisional Application No. 60/012,517 filed on Feb. 29,
1996, both having named inventor Gary B. Robinson.

FIELD OF THE INVENTION

This invention involves the display of advertising to users
of an interactive communications medium. It is particularly
useful with the World Wide Web, which utilizes a
communications protocol on the Internet.

To access the Internet, and to carry out the methods
described in this document, one must have a CPU, RAM, an
Internet connection (for instance, through a phone line and
modem), an input device such as a keyboard, and an output
device such as a TV, CRT or LCD.

All of the above-identified hardware, necessary to carry
out the steps described in this document, will be considered
to be implied in the following description of the present
invention.

INTRODUCTION

Under the old model for the advertising industry, the
subject matter of one "unit of publication" (a magazine, a
newspaper section, a radio show, a TV show) was often the
sole means an advertiser possessed in order to guess the
interests of a particular reader or viewer. If, for instance,
the magazine was about cars, advertisers knew that anyone
reading it was highly likely to be interested in cars.

However, on the Internet's World Wide Web, multiple
units of publication (that is, multiple Web pages and user
actions over time) can be used to determine the interests of
each individual. Moreover, this information can be gathered
very inexpensively. To do this, we take advantage of the fact
that a Web user's actions can be tracked over time. This rich
source of information about each person will be used to
bring about an era of far more efficient advertising. The
information used includes not only which sites were visited
by the user and for how long, but also which ads the user
clicked on, as well as other information.

Under the old model, as it exists on the Web today, most
of this information is ignored. It is technically possible to
acquire it, but it isn't generally being done. This is due to
reasons of momentum of the old model, lack of well-known
software and statistical tools for making use of the
information, and, not insignificantly, fears of an invasion of
privacy (a problem that must be dealt with and that this
concept paper will explore below). But this information,
when acquired and used, will be extremely useful in trying
to make sure that each square inch of the limited Web
advertising space on each site is used to effectively reach
individual customers.

This ignored information, because of its power to enhance
advertising effectiveness, is extremely valuable.

Moreover, the use of this information benefits not only the
users, but also every one of the interested commercial
entities: advertisers, ad agencies, and Web sites. Each
entity will be economically motivated to facilitate the move
to the new paradigm.

SUMMARY OF THE INVENTION

On the World Wide Web, and other media such as
interactive television, it is possible to show different ads to
different people who are simultaneously viewing or
interacting with the same content. For instance, a particular
Web page may have an area reserved for advertisements.
Anyone of average experience in the field of Web programming
would be able to create code to show different advertisements
to different people simultaneously viewing that page. This
can be accomplished, for instance, by means of a CGI script.

Since different people have different interests, it is
apparent that this can be a useful thing to do. But the
question remains: how do we determine which advertisements
to choose for a particular viewer?

This invention is based on the fact that people who have
shown a tendency for similar interests and likes and dislikes
in the past usually continue to show a tendency for such
similarities in the future. In particular, people who have
shown a historical tendency to be interested in the same ads
in the past usually continue to display such a tendency
as time goes on. People who strongly display such
similarities with respect to a particular person (whom we will
refer to as "the subject") are referred to as that person's
"community."

If the members of a particular consumer's community
tend to click on a particular Web ad, then there is a certain
likelihood that the subject consumer will also tend to click
on that ad.

To take advantage of this fact, this invention combines
techniques for solving two problems: determining the
subject's community, and determining which ads to show based
on characteristics of the subject's community.

In this invention, the information used to determine
whether a given individual should be in the subject's
community is gleaned from the activities of the individual in
the interactive medium in question. For instance, when the
interactive medium is the World Wide Web, the information
may involve such facts as the choices of Web sites the
individuals have each visited, the frequency of such visits,
the nature of the content at those sites, etc. If the sites are
online stores, the information may involve the choice of
specific items purchased as well as the prices of those items.
If, as another example, the site is an entertainment
recommendation service based on user-supplied ratings
(Firefly at www.ffly.com is an example), the ratings can be
used. One more example is the selection of Web ads each
individual has chosen to click on. In one embodiment, there is
a feature which allows individuals to indicate their
disinterest in an ad; this serves as additional input.

There needs to be a means to track a consumer's activities
so the information he generates can be tied together in the
database. In one embodiment, this is accomplished by means
of Netscape-style "cookies" which are stored on the
consumer's hard disk by a CGI routine. In other embodiments,
software running on the consumer's computer, such as a
Netscape-style in-line plug-in, a screensaver working in
conjunction with the Web browser, or the Web browser
itself, is used to tie the data together.

This information is used as the basis for calculations
which generate a (usually numeric) measure of similarity
between individuals. Examples of such similarity measures
are well-known to programmers of ordinary skill in the field
of collaborative filtering.

The individuals with the greatest calculated similarity
become the subject's community.

In one embodiment clusters are formed of groups of very
similar consumers. Then, the subject's community consists of
all or some of the other members of his cluster.
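
The community-selection step summarized above can be sketched in a few lines. The text does not commit to a particular similarity measure, so this sketch assumes the Jaccard overlap between the sets of ads two users have clicked on; the function names and the sample data are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def jaccard_similarity(a: Set[str], b: Set[str]) -> float:
    """Overlap between two users' clicked-ad sets, from 0.0 to 1.0."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def find_community(subject: str, clicks: Dict[str, Set[str]], size: int) -> List[str]:
    """Return the `size` users most similar to the subject (assumed measure: Jaccard)."""
    subject_clicks = clicks[subject]
    others = [user for user in clicks if user != subject]
    # Sort by descending similarity; break ties by user id for a stable result.
    others.sort(key=lambda u: (-jaccard_similarity(subject_clicks, clicks[u]), u))
    return others[:size]


# Hypothetical click histories keyed by an anonymous user identifier.
clicks = {
    "subject": {"ad1", "ad2", "ad3"},
    "u1": {"ad1", "ad2", "ad3"},
    "u2": {"ad1", "ad2"},
    "u3": {"ad9"},
}
community = find_community("subject", clicks, size=2)
```

Any other numeric similarity measure could be substituted for `jaccard_similarity` without changing the selection step.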
The next major task is to decide what ads to show the
subject based on his community.

In one embodiment of the invention, a new ad is displayed
randomly or on a fixed schedule to a certain number of users.
During this "training period" for the new ad, a certain
percentage of the members of the subject's community will
click on it. If this is an unusually high proportion, then there
is a relatively high likelihood that the ad will be of relatively
high interest to the subject. In one embodiment, statistical
techniques are used to determine a probability, associated
with a fixed confidence level, with which we can assume a
randomly-chosen member of the subject's community will
tend to click on the ad; this probability is used as the measure
of similarity. Other embodiments involve other analytic
techniques.

There are a number of additional features found in other
embodiments of the invention.

In one embodiment, the advertiser specifies the
demographic profile he wants to show the ad to. In that case,
as long as we have demographic information available for
some consumers, the system targets ads by considering the
subject's community members who have supplied
demographic information. For instance, by computing the
average age of the members of the subject's community who
have supplied their ages, the system is enabled to make an
"intelligent guess" about the subject's age, and use that
guess for the purpose of targeting ads.

In one embodiment of the invention, special Web pages or
sites are supplied which enable advertisers to specify
specific sites they would like their ads to run on (or not
run on); similarly, special Web pages or sites are supplied
which enable Web site administrators to specify ads they
would like to display or not display.

In other embodiments, means are supplied for consumers
to specify and update their demographic information; these
means take the form of a Web site or page in one
embodiment, and software running on the consumer's
computer in another.

In some embodiments, software running on the consumer's
computer makes the choices about which ads are to be
displayed for that user. This embodiment has the advantage
that it obviates the need for a central database storing
detailed information about consumers together with an
identifier for each consumer; so the consumer's privacy is
protected.

BRIEF DESCRIPTION OF THE DRAWINGS

A more complete understanding of the present invention
and the attendant advantages and features thereof will be
more readily obtained by reference to the following
detailed description when considered in conjunction with the
accompanying drawings wherein:

FIG. 1 is a flowchart diagram for the steps performed in
selectively displaying one or more advertisements to a
subject in accordance with the teachings of the present
invention.

DETAILED DESCRIPTION

Referring to FIG. 1, a preferred embodiment of the
present invention is shown. In this embodiment, the system
begins by tracking activities of the subject in the interactive
medium (step 10). Next, the system derives information
from the activities of the subject (step 20). The system then
determines a community of the subject using all or a portion
of the information (step 30). Finally, the system determines
which of the one or more advertisements to present to the
subject based on the subject's community by displaying a
new advertisement for a training period and determining
whether a high or low proportion of members of the
subject's community have chosen to view further information
about the advertisement (step 40).

Smart Ad Boxes

The centerpiece of this invention is the "Smart Ad Box."
A Smart Ad Box is an area on a Web page (usually
rectangular) which is used to display Web advertising.
Special software algorithms are used to determine which ads
are shown to which users; different visitors to a Web page
can simultaneously see different ads.

A number of factors can be used by the software in
determining which ads to show. For instance, based on their
Dec. 6, 1995 press release, the company C|Net appears to be
planning to implement a Smart Ad Box-like system which
decides which ads to present to which users based on such
information as the type of Web browser they're using, their
age, gender, Internet domain (EDU, COM, etc.) and other
demographic information. A Dec. 19, 1995 press release
from Novo Media Group indicates at least somewhat similar
plans.

This invention involves using automated collaborative
filtering (ACF) either instead of, or in addition to, the
above-mentioned techniques. (ACF is also referred to as
social information filtering.) As far as is known, there is no
prior art that involves using ACF for determining which ads
to show to whom.

For ease of discussion, this patent will focus exclusively
on the use of ACF in Web advertising. However, it must be
stressed that ACF can be used in a complementary manner
to techniques such as those C|Net and Novo Media Group
are developing. ACF can give us a certain amount of
evidence that a particular ad should be shown to a particular
user; such information as age, sex, Internet domain, etc. can
be considered as well.

From the point of view of a Web site hosting a Smart Ad
Box, the Smart Ad Box consists of a small amount of HTML
code. It may optionally involve non-HTML code, such as
Java. It involves calling a CGI routine.

What the user sees.

When a Smart Ad Box appears on a page, a user viewing
that page will see an ad which is targeted to that particular
user. Thus, simultaneous viewers of the same page will often
be presented with different ads. The ad is visually contained
in the Smart Ad Box. The Smart Ad Box may or may not be
rectangular in shape; it will often, but not necessarily, exist
in a fixed region on the screen.

The Smart Ad Box will present different ads to a user over
time. Presenting the same ad over and over is not maximally
effective; the user would simply become used to it and would
therefore tend to ignore it.

This invention involves rotating the user through different
ads which are likely to be of interest to that particular user.
The rotation schedule can be chosen for maximal overall
advertising effectiveness. One way to measure effectiveness
would be the frequency of clicks on ads in Smart Ad
Boxes; the rotation schedule could be chosen to maximize
this number. It could involve such information as the number
of times the user has seen each ad in the past, and the
predicted likelihood that the user will be interested in the
given ad. Another factor that could be considered is
resonance with the Web page showing the ad: perhaps ads that
relate in some way to the subject matter of the page will be
more likely to be clicked on.
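
One way the training-period statistics and the rotation schedule could fit together is sketched below. The text names neither a statistical technique nor a rotation formula, so both are assumptions here: the Wilson score lower bound stands in for "a probability associated with a fixed confidence level," and the score simply divides that probability by one plus the number of times the user has already seen the ad. All names are hypothetical.

```python
import math
from typing import Dict, Tuple


def click_rate_lower_bound(clicks: int, views: int, z: float = 1.645) -> float:
    """Wilson score lower bound on the community's click probability.

    z = 1.645 corresponds roughly to a one-sided 95% confidence level
    (an assumed choice; the source only says the level is fixed).
    """
    if views == 0:
        return 0.0
    p = clicks / views
    denom = 1 + z * z / views
    centre = p + z * z / (2 * views)
    margin = z * math.sqrt(p * (1 - p) / views + z * z / (4 * views * views))
    return max(0.0, (centre - margin) / denom)


def pick_next_ad(community_stats: Dict[str, Tuple[int, int]],
                 times_seen: Dict[str, int]) -> str:
    """Choose the ad with the best predicted interest, discounted by repetition.

    community_stats maps ad id -> (clicks, views) by the subject's community
    during the training period; times_seen maps ad id -> prior exposures.
    """
    def score(ad: str) -> float:
        clicks, views = community_stats[ad]
        return click_rate_lower_bound(clicks, views) / (1 + times_seen.get(ad, 0))

    # Iterate in sorted order so ties resolve deterministically.
    return max(sorted(community_stats), key=score)


stats = {"ad_a": (50, 100), "ad_b": (1, 100)}
first_choice = pick_next_ad(stats, times_seen={})
```

Using a lower bound rather than the raw click proportion keeps an ad with very few training views from outranking a well-tested one on the strength of a lucky click or two.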
As with most current Web advertisements, clicking on a
Smart Ad Box will cause the user to be transported to a Web
site chosen by the advertiser.

Moreover, particular implementations of the present
invention can optionally include certain additional features,
such as the ability to reject an ad, for instance with an
option-click of the mouse. A user would do this for an ad that
held no interest for him. The rejected ad would automatically
be replaced with another ad targeted to that user.

Control features for advertisers and Web site managers.
The central database can optionally contain rules or
control records provided by advertisers and Web site
managers. These could be used for the following purposes:
An advertiser may not want to be associated with certain
Web sites or types of Web sites; alternatively there may
be certain sites or types of sites they would like to be
associated with as strongly as possible. Advertisers
could specify such inclinations, and they can be stored
in a database. Then, when the software is choosing the
next ad to show to a particular user who is visiting a
particular Web site, those factors can be taken into
account.

Similarly, a Web site may prefer certain advertisers or
advertisements or types of advertisements to others.
The Web site can specify such inclinations, and they
can be taken into account when the next ad is chosen
for a particular user currently visiting that Web site.
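
The two kinds of control records described above can be sketched as a pair of allow/block rules that must both pass before an ad is eligible for a page. This is only an illustration under stated assumptions: the `Rule` structure, the `may_show` function and the sample identifiers are hypothetical, and the source does not say how such rules are stored.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Rule:
    """A control record: an optional allow-list plus a block-list."""
    allowed: Optional[Set[str]] = None  # None means "anything not blocked"
    blocked: Set[str] = field(default_factory=set)

    def permits(self, item: str) -> bool:
        if item in self.blocked:
            return False
        return self.allowed is None or item in self.allowed


def may_show(ad: str, site: str,
             advertiser_rules: Dict[str, Rule],
             site_rules: Dict[str, Rule]) -> bool:
    """An ad is eligible only if the advertiser accepts the site
    and the site accepts the ad."""
    ad_rule = advertiser_rules.get(ad, Rule())
    site_rule = site_rules.get(site, Rule())
    return ad_rule.permits(site) and site_rule.permits(ad)


# Hypothetical records: the car ad runs only on the car site,
# and the news site refuses the casino ad.
advertiser_rules = {"car_ad": Rule(allowed={"cars.example"})}
site_rules = {"news.example": Rule(blocked={"casino_ad"})}
```

The ad-selection software would apply such a check as a filter before ranking the remaining candidates for the user.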
One way for advertisers and Web sites to supply these 
rules would be for a Web site to be constructed which would 
do the following: 

1. There would be a page which would present advertisers
with a list of Web sites which are currently running Smart Ad
Boxes. (Optionally, these Web sites could be grouped
according to subject matter. For instance, Web sites
concerning automobiles could be grouped together. In addition,
individual pages of Web sites could be listed. Thus, there
could be a three-level hierarchy.)

2. It would allow the advertiser to input identification
information about an ad, for instance its URL. This will
tell the software that the information given will apply to that
particular ad.

3. It would allow the advertiser to indicate which Web
sites he would like to have display his ad. If Web sites are
grouped by subject matter and/or individual pages are listed,
the advertiser should be able to indicate choices on those
items, too.

For instance, a checkbox could appear next to each item.
If the advertiser clicks a checkbox for an item which has
subordinate items (for instance, the advertiser may have
clicked on the checkbox for a Web site which was listed with
its individual pages) then the checkboxes for the subordinate
items could be automatically "checked" or "filled in" by the
software. (Java or JavaScript could be used to do this in
"real-time" instead of requiring the user to submit the form.)
But a number of other mechanisms could be used instead of
checkboxes; for instance, the listings could change color to
indicate having been chosen. Checkboxes are probably
preferable, though, since their meaning is so intuitively
clear.
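
The automatic checking of subordinate items can be sketched as a recursive walk over the subject/site/page hierarchy. The source suggests Java or JavaScript in the browser; Python is used here only to show the logic, and the `Item` structure and sample names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """One row in the listing: a subject group, a Web site, or a page."""
    name: str
    checked: bool = False
    children: List["Item"] = field(default_factory=list)


def set_checked(item: Item, value: bool) -> None:
    """Check or uncheck an item and propagate the choice to everything below it."""
    item.checked = value
    for child in item.children:
        set_checked(child, value)


def checked_pages(item: Item) -> List[str]:
    """Collect the names of checked leaf items (individual pages)."""
    if not item.children:
        return [item.name] if item.checked else []
    names: List[str] = []
    for child in item.children:
        names.extend(checked_pages(child))
    return names


# Hypothetical three-level hierarchy: subject -> site -> pages.
site = Item("cars.example", children=[Item("/reviews"), Item("/classifieds")])
subject = Item("Automobiles", children=[site, Item("trucks.example")])
set_checked(site, True)  # the advertiser clicks the checkbox for one site
```

After the call, both pages of the chosen site are checked while the sibling site and the subject group itself are left as they were.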

4. Optionally, there could be a page that would work the
opposite way. It would allow an advertiser to identify a
particular ad, and then it would allow him to specify the sites
(subject groups, pages) which have Smart Ad Boxes but
which he would rather not allow to show his ad. Thus, his
ad would be distributed to all pages with Smart Ad Boxes
except those that were specified. Again, pages could be
specified by means of checkboxes at page, site, or subject
matter levels.
5. In the above, whenever a page is listed, it should
optionally be possible to click on the listing to be transported
to that page in order to investigate it.

6. Optionally, advertisers should be able to retrieve the 
information already entered for a particular ad. For instance, 
an advertiser may change his mind about showing an ad on 
a given site. So, by specifying the ad's identifier, the 
advertiser should be presented with a listing of pages which 
indicate the choices he has already made; ideally, he should 
be able to change those choices using the same techniques 
used to enter those choices originally — for instance, by 
clicking on the checkboxes. 

7. Through the pages described above, an advertiser 
would be able to specify the pages which will be allowed to 
display the ad. However, the Web sites with Smart Ad Boxes 
also need to have a choice. So a page could be set up for 
them which listed all the ads which they are allowed to show. 
As in the other case, checkboxes could be used to indicate 
which ads will be chosen; again as in the other case, the 
webmaster should be able to indicate either the specific ads 
he wants to present on his page (automatically disallowing 
the rest) or the ads he doesn't want to present (automatically 
including the rest). 

8. As in the other case, allowed ads could be presented 
hierarchically by subject matter, with checkboxes at both 
levels. 

9. The ad listings could, optionally, consist of the ad 
banners themselves. Alternatively, they could be "hot- 
linked" text that the webmaster could click on to be trans- 
ported to a page containing the banner (which might addi- 
tionally have other information supplied by the advertiser 
about the ad). There should optionally also be a way for the 
webmaster to visit the site that the banner will be linked to; 
this could be accomplished simply by hotlinking the banner 
to the site, just as will be the case for users. It could also be 
accomplished other ways, including having a button, next to 
the listing for the ad, which is hotlinked to the related site. 

10. Alternatively, the system could work the opposite way. 
Instead of enabling advertisers to offer ads to chosen Web 
sites, the process could start with Web sites offering pages to 
advertisers, which could then choose which pages they want 
to accept. 

11. In cases where hierarchies are displayed, the
hierarchies could be collapsible, similar to the way files are listed
in the Finder of Macintosh's System 7 operating system 
when View is "by Name." This would enable people using 
the lists to navigate them more effectively, especially if the 
actions for expanding and collapsing hierarchy levels were 
very fast. To achieve a quick and responsive user interface, 
a Java applet could be written which handled some or all 
aspects of the user interaction. 

Control features for users. 
Demographic data. 

Web pages can optionally display a hot link to a site where 
users can enter their demographic data. Users can optionally 
be given the ability to modify their demographic data at any 
time. Finally, if they wish to, they can optionally be given 
the ability to delete their demographic data at any time. 

This control over their demographic data will alleviate
many users' privacy concerns.

In addition, users should have easy access to information 
stating how the demographic data is used, and who has 
access to it. 

It will probably be the case that some users will have less 
concern about privacy issues than others. The Web site that 
allows users to update or delete their demographic data 
could optionally also allow users to specify a chosen level of 
privacy. For instance, some users might wish to allow
companies to have access to their demographic data in order
to receive certain special offers (which could be made by
direct mail, email, or other means). Optionally, there could
be a list where users could choose companies which will be
allowed to have access to their information. For instance, a
checkbox could appear next to each company name. As
described in the section of this document which discusses
the means by which Web sites would choose advertisers,
users could choose companies by means of a
hierarchically-organized list, grouped by product category.
Again, the hierarchies could be collapsible in order to
increase ease of navigation. Of course, the companies could
alternatively be listed in some other manner, such as
alphabetical order.

Users can be induced to supply such data by special offers
such as discounts on selected merchandise.

Tracking Data. 

Users can optionally be given the ability to tell the system
not to store their tracking data. (If the user elected not to be
tracked, the system would have to decide what ads to display
based on other means, such as domain type [EDU, NET,
etc.], browser and computer types, demographic data that
had been obtained, etc.)

Storing data on the user's machine instead of in a central
database.

As still another option, it would be possible to store the
tracking data only on the user's own machine, so that the
data would be completely private; it would never have to be
compiled on another machine.

This means that the criteria normally used by the system
to decide which ads the user will see and the order these ads
will be displayed in will have to be sent, across the Internet,
into the user's computer; decisions about the ads will be
made there.

Let's refer to a user who has elected to store his tracking
data locally as Sam.

For instance, to make use of ACF (discussed elsewhere in
this document), the tracked history of various users (or some
subset of that information) will have to be accessed by (in
other words, sent to) Sam's computer. (This data would, of
course, be sent without any identifying information that
would enable the recipient to learn what individual was
associated with what tracking data.) Software running on
Sam's system could then decide which of these users are
most similar to Sam, and make subsequent decisions about
which ads to display for Sam.

To make the process of sending other users' tracking data
to Sam more efficient, the system could optionally be
designed so that similar users were grouped into statistical
clusters; all the people in one cluster would be more similar
to each other than to people in any other cluster.

Then, information describing the clusters could be sent to
Sam's machine, which could decide which cluster Sam was
in. A variety of different types of information could be sent
to Sam's machine describing each cluster. For instance, the
average amount of time spent on each tracked Web site,
where that number is computed from the data corresponding
to all users in the cluster, would be a good candidate. For
each cluster, this number could be sent for every tracked
page (or for only a subset of the tracked pages, which could
be chosen, for instance, for their statistical significance).
Then, software running on Sam's machine could determine
how closely each cluster matches Sam's activities; Sam
would be considered to be in the cluster he matches most
closely.
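
The cluster-matching step on Sam's machine can be sketched as a nearest-profile lookup. The text does not say how "how closely each cluster matches" is measured, so Euclidean distance over average time per tracked page is an assumption; the names and figures below are hypothetical.

```python
import math
from typing import Dict

# A profile maps a tracked page to average seconds spent on it.
Profile = Dict[str, float]


def distance(a: Profile, b: Profile) -> float:
    """Euclidean distance between two profiles; missing pages count as 0 seconds."""
    pages = set(a) | set(b)
    return math.sqrt(sum((a.get(p, 0.0) - b.get(p, 0.0)) ** 2 for p in pages))


def closest_cluster(user: Profile, clusters: Dict[str, Profile]) -> str:
    """Return the name of the cluster whose averages best match the user's activity."""
    # Iterate in sorted order so ties resolve deterministically.
    return min(sorted(clusters), key=lambda name: distance(user, clusters[name]))


# Cluster summaries as they might be sent from the central system
# (no individual user's data is included).
clusters = {
    "car_fans": {"cars.example": 300.0, "news.example": 20.0},
    "news_readers": {"cars.example": 10.0, "news.example": 400.0},
}
sam = {"cars.example": 250.0, "news.example": 40.0}  # stored only locally
sam_cluster = closest_cluster(sam, clusters)
```

Only the cluster summaries travel over the network in this arrangement; Sam's own profile never leaves his machine.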

Alternatively, instead of sending information about each 
user or cluster into Sam's computer, information could be 
sent about the demographics which apply to each ad. These
demographics could be supplied to the system by the adver- 
tiser or ad agency, or could be determined by a central 
computer by means of ACF as described elsewhere in this 
document. 

Since Sam's information would only be stored in his own 
computer, he would not have as many privacy concerns with 
regard to inputting demographic information. So there is a
good likelihood that he would be willing to supply such 
information. If he did so, the system could optionally not 
store his tracking information. 

So, to determine which ads to display for each user and 
with what frequency and when, the software running on 
Sam's system could simply see how closely each ad matches 
his demographic data. 
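
A minimal sketch of such a demographic match follows. The scoring rule (the fraction of an ad's targeted attributes that Sam's data satisfies) is an assumption, as are the attribute names and the representation of targets as ranges or sets.

```python
from typing import Dict, List, Set, Tuple, Union

# A target is either an inclusive numeric range or a set of accepted values.
Target = Dict[str, Union[Tuple[int, int], Set[str]]]


def match_score(target: Target, user: Dict[str, object]) -> float:
    """Fraction of the ad's targeted attributes that the user's data satisfies.

    Attributes the user has not supplied count as non-matching, and an ad
    with no targeting information scores 0.0 (both are assumed conventions).
    """
    if not target:
        return 0.0
    hits = 0
    for attribute, wanted in target.items():
        value = user.get(attribute)
        if value is None:
            continue
        if isinstance(wanted, tuple):
            low, high = wanted
            if low <= value <= high:
                hits += 1
        elif value in wanted:
            hits += 1
    return hits / len(target)


def rank_ads(ads: Dict[str, Target], user: Dict[str, object]) -> List[str]:
    """Order ads from best to worst demographic match, ties broken by name."""
    return sorted(ads, key=lambda name: (-match_score(ads[name], user), name))


# Hypothetical demographics supplied by advertisers for each ad.
ads = {
    "car_ad": {"age": (25, 40), "domain": {"COM"}},
    "book_ad": {"age": (18, 35), "domain": {"EDU"}},
}
sam = {"age": 30, "domain": "EDU"}  # kept on Sam's own machine
ranking = rank_ads(ads, sam)
```

The ranking could then feed whatever display frequency and timing policy the software on Sam's system uses.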

Optionally, every time Sam clicks on an ad, his demo- 
graphic information could be sent to a central database, 
where it would be used to analyze the overall demographics 
of people who click on the ad. However, no identifying 
information for Sam need be sent or stored. 

The technique of storing all tracking data on Sam's 
machine could be implemented, using technology available 
at the end of 1995, with Netscape's protocol for Inline 
Plug-Ins. Inline Plug-Ins, unlike Java applets, have the 
ability to write directly to the user's hard disk. (The situation 
for Java applets may change in the future, and other tech- 
nologies may emerge that can accomplish the same 
purposes.) This ability is essential for storing the user's data. 

The Inline Plug-In could, if desired, handle all function- 
ality of displaying the ad, determining what ad to show, and 
reading and writing the relevant information from and to the 
hard disk. Otherwise, this functionality could be divided
between the Inline Plug-In and other software, such as Java 
applets. 

A separate application could be written, which the user 
could download, to manage his demographic information on 
his hard disk. This program could contain the user interface 
that would enable him to easily update the information. 

Security note: 

No matter which of these methods is used, the cookie 
mechanism provides a very high level of security. A user's 
randomly generated cookie is stored on the user's machine; 
and that is the one and only way information stored on a 
central database is associated with that user. The cookie 
mechanism is such that only programs with the same domain 
name as the one that created the cookie can read or modify 
it. So while the system's central server machine can track a 
user by means of the cookie, a program existing under a 
different domain name will not be able to access the cookie 
at all. 
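
The cookie handling described in this note can be sketched with Python's standard `http.cookies` module. The cookie name `tracking_id`, the domain, and the one-year lifetime are illustrative assumptions, and the expiry date uses the current HTTP date format rather than the format of the original Netscape cookie proposal.

```python
import secrets
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from http.cookies import SimpleCookie
from typing import Optional


def new_tracking_cookie(domain: str, now: datetime) -> SimpleCookie:
    """Build a cookie holding a random identifier with no personal information."""
    cookie = SimpleCookie()
    cookie["tracking_id"] = secrets.token_hex(16)  # 32 random hex characters
    cookie["tracking_id"]["domain"] = domain  # restricts which servers receive it
    cookie["tracking_id"]["path"] = "/"
    # `now` must be timezone-aware UTC for usegmt=True to be accepted.
    expires = now + timedelta(days=365)
    cookie["tracking_id"]["expires"] = format_datetime(expires, usegmt=True)
    return cookie


def read_tracking_id(cookie_header: str) -> Optional[str]:
    """Extract the identifier from a request's Cookie header, if present."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    morsel = jar.get("tracking_id")
    return morsel.value if morsel is not None else None


cookie = new_tracking_cookie("ads.example.com", datetime.now(timezone.utc))
set_cookie_header = cookie.output()  # e.g. "Set-Cookie: tracking_id=...; ..."
```

The random value is the only key linking the browser to its records in the central database, which is why no name or address needs to be stored alongside it.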

Moreover, there is no need to store user-identification 
information such as email addresses (or phone numbers or 
postal addresses, etc., etc.) on the central system. So there is 
no way the company running the system will have the ability 
to do anything to intrude on the user's privacy. For instance, 
there would be no way that the tracking or demographic 
information could be sold as the basis of a mailing list (email 
or otherwise). The fact that such identifying information 
does not need to be stored in the database is a key feature of 
this invention. 

Tracking users. 

There are a number of possible ways to track users. Some 
will be presented here: 

1. Tracking by means of code on participating Web sites. 

Each Web page which contains a Smart Ad Box will 
contain code, which may be comprised of HTML, Java, or 
other languages, which will allow a user to be tracked. (This 
code may work in conjunction with other software, such as
Netscape-style Inline Plug-Ins.) In addition, such means for
tracking a user can be embedded in pages that do not
themselves display advertising. Pages which have the ability
to cause a central database to be updated with tracking
information will be referred to in this document as
"tracking-enabled."

Every time a user visits a tracking-enabled page, a central
database will be updated to show that that particular user
visited that page. Optionally, the length of time spent on the
page, or other such information, can also be stored.

One way to perform this update will be for the page to
reference a CGI script which exists on the machine
containing the central database. This CGI script, which can be
written in languages such as Perl, C, Basic, AppleScript, etc.,
would perform the database updates.

Alternatively, a CGI script can exist on the machine which
is hosting the page. To perform updates, it would
communicate over the Internet with a program which exists on
the machine hosting the database.

Some types of information that an embodiment of this
invention could choose to store for each user:

An identifier of the page the user visited.
The length of time spent on the page.
The amount of money spent by the user while visiting the
page (or site); this would be useful for Web sites that
are retail stores.
The identifier of the ad displayed for the user in the Smart
Ad Box.
Whether or not the user clicked on the ad.
Whether or not the user rejected the ad (mentioned in an
earlier section of this document).
An identifier of each particular item purchased, for
instance the ISBN of a book.

There has to be a mechanism for re-identifying a user
from session to session.

The preferred embodiment involves using the Netscape
Navigator's cookie mechanism. This will allow us to
accurately track the 70% to 80% of Web users who use the
Netscape product. In addition, it will work with any Web
browser that has a cookie mechanism compatible with
Netscape's. It is likely that the vast majority of Web
browsers will, in time, have such compatibility. The preferred
way to use it is as follows:

Each time the user references a tracking-enabled page, a
CGI script is executed. This CGI script, referred to in this
[...]
these routines. For instance, there might be different
properties associated with the different Tracking Scripts; for
instance, in cases where the Tracking Scripts also draw the
Smart Ad Boxes, different Tracking Scripts might draw
Smart Ad Boxes of different sizes. Again, it is essential that
all of these Tracking Scripts be under the same domain tail
in order that they can all access the cookie.

The Tracking Script will examine the cookies passed to it
to see whether one of them is the Tracking Cookie.

If it did receive the Tracking Cookie:
then this cookie will contain the identifier of the user; the
central database can then be updated with the ID of the user
and of the current page, and, optionally, other information
such as the time spent on the page.

Optionally, the expiration date of the Tracking Cookie
could be updated; for instance, it could always be set to one
year after the last Tracking Cookie access.

If it did not receive the Tracking Cookie:
then it creates it. The value of the Tracking Cookie could
be generated using a random number generator; one of many
other alternatives would simply be to pick a number one
greater than the last value generated.

The Tracking Cookie is then stored on the user's machine
using the Netscape cookie mechanism; each time from then
on that a user visits a tracking-enabled page, the stored
Tracking Cookie will be used to re-identify that user. The
Tracking Cookie should be given an expiration date so
that it doesn't disappear when the user leaves. The
expiration date could be, for instance, one year in the future.

Note that the Tracking Cookie will not allow anyone to
intrude on the user's privacy by sending him email or by any
other means. There need be no way to associate the Tracking
Cookie with the user's name, physical location, or any other
personally-identifying information.

The techniques involved in writing these CGI's are
known to any competent practitioner of Netscape-related
CGI programming.

There are other ways to track users, such as using
environment variables such as REMOTE_ADDR,
REMOTE_HOST, REMOTE_IDENT and the header field
HTTP_FROM. These are known to any competent practitioner
of CGI programming. Moreover, other methods will probably
become practical in time. So the cookie mechanism is not
required, but does have advantages.

2. Tracking by means of software that runs on the user's
machine whenever he is browsing the Web.

The problem with tracking by means of code on partici-

document as the Tracking Script, is referenced by each pating Web sites is that it only enables users to be tracked 

tracking-enabled page; that is, all tracking-enabled pages, while they are visiting participating sites. So the amount of 

* which may be spread out over many different host machines, 50 information that can be gathered is limited in that way. 

will all call the same Tracking Script which exists on just It would be much better to be able to track all sites visited 

one machine or networked group of machines accessible by a user. 

through the one URL. (It is necessary for the Tracking Script The challenge is to get the code that does this tracking into 

to exist under just one domain tail in order to receive the the user's machine. Users don't want to manually download 

cookie no matter which page the user is on. A CGI on a 55 software unless they clearly understand that there is a 

machine with a different domain tail will have no way to fundamental benefit in it for them, 

access the cookie.) A domain tail consists of the "tail end" It might be thought that a Java applet would be the perfect 

of the domain name such that the included portion contains means to track user activity over time. But current imple- 

at least 2 or 3 periods, depending on the particular top-level mentations of Java automatically flush Java applets from the 

domain. For more on this, see Netscape's technical note on 60 cache whenever the user moves to a domain other than the 

the Web at http://www.netscape.com/newsrefystd/cookie_ one the Java applet originally came from. So Java currently 

spec.html or any other documentation on Netscape's cookie has limited usefulness for this purpose, 

mechanism, which is a public protocol that any competent Alternatives: 

practitioner is familiar with. Building tracking code into the Web browser itself. 

Alternatively, a set of Tracking Scripts could be made 65 Probably the ideal methodology would be to build the 

available, all under the same domain tail, such that a tracking code into the Web browser itself. In fact, Web 

particular participating Web site could interact with one of browsers do already have some tracking code built-in; for 
instance, the Netscape Navigator has a "Go" menu which lists
sites previously visited in the current session. This
information is lost, however, when the user quits Netscape
Navigator.

A Web browser could automatically open up a socket for
communications with the central database. At intervals, it could
send tracking information to the central database without any
participation on the part of the user. This information could
include, for example, the URL of each page visited and the
amount of time spent there.

The only real drawback to this technique is that it requires the
participation of the companies which create the browser
software.

Tracking the user by means of software running on the user's
machine simultaneously with the browser, using software that has
no user interface (or a minimal user interface).

The user could download an application or other type of software
(such as a Macintosh-style control panel or system extension)
which could track his activities and communicate them to the
central database.

This software could operate with the cooperation of the Web
browser. For instance, the Netscape Navigator allows user
activities to be communicated to separate applications; on the
Macintosh, this mechanism is based on Apple Events.

To motivate the user to download this software, an incentive
could be given. For instance, the user could be offered a great
deal on an advertised product, or promised a number of such
great deals in the future. In fact, users could even be paid to
download the software.

Tracking the user by means of software running on the user's
machine which is of its own benefit to the user, separate from
its tracking functionality.

It would be best if users could be motivated to download the
tracking software without being offered special deals or
financial rewards. For this to be the case, the software has to
provide some benefit of its own.

One example of this would be a screensaver. Screensavers
typically run all the time, although they only take over the
screen when the user is inactive.

A screensaver that had some desirable properties compared to
other screensavers available in the marketplace, and that was
inexpensive or free of charge, would be likely to be downloaded
from the Web by quite a few users.

One way that such a screensaver could differentiate itself from
other current screensaver products is by means of its ability to
communicate over the Internet, which is required for its
tracking functions.

For instance, this screensaver could be designed so that it
could display HTML and/or execute Java code. In fact, it could
have much of the functionality of a Web browser. (Or, it could
use its own protocols for displaying images and text on the
screen, different from those used in Web browsers. However, it
would probably be best for it to use standard protocols.)

Thus, companies and individuals could provide content for this
"Internet Screensaver." Users could choose the URL they want to
be connected to, for instance, by means of a menu that was
automatically updated to show all URL's which supply Internet
Screensaver content. Alternatively, the list of URL's could be
in the form of HTML or Java output displayed on a screensaver
page. Or a dialog box could be used, etc. (The list of such
URL's could be communicated from a central site across the
Internet.)

An example of content that would motivate many users to download
the Internet Screensaver would be a continuously updated stock
ticker. A couple of other examples would be continuously updated
news headlines or weather reports. A further example might be
showing the status of the user's email box.

Continuously updated content would only be possible for users
with continuous Internet connections. That situation currently
is common in office situations, but not in home situations. Of
course, it is quite possible that that situation will be
changing in the future. For instance, cable companies may in the
relatively near future offer continuous Internet connections at
an inexpensive price.

For users without continuous Internet connections, however,
semi-continuous content could be made available. For instance,
the software could enable an Internet connection for a brief
period in every hour (or other interval) during which news
headlines, weather, stock prices, or other information could be
downloaded and displayed during periods of inactivity during the
intervening hour.

Alternatively, the Internet Screensaver could simply wait for
the user to log on to the Internet, and download content at that
time to be used as content until the next user-initiated
Internet connection. Again, the content could include news
stories, etc.; it could also include less timely content such as
comic strips.

Additionally, the Internet Screensaver could interact with the
operating system of the user's computer to perform other
functions.

For instance, it could periodically retrieve the exact time from
a clock residing on the Internet, and then use that information
to set the clock in the user's computer. (For instance, as of
Dec. 26, 1995, the current time according to the US Naval
Observatory is available on the World Wide Web at
http://tycho.usno.navy.mil/cgi-bin/timer.pl) Based on comparing
the true time to the computer's internal clock, a "drift" factor
could be computed. The screensaver could then update the clock
at regular intervals to compensate for expected drift; it could
do this between accesses to the true time over the Internet.

(Note: the Internet Screensaver would have commercial value even
without its relationship to the advertising paradigm discussed
in this paper, if, for instance, the user-tracking capabilities
were to be omitted. For instance, many Web sites would benefit
from publicizing themselves by means of providing content to the
Internet Screensaver.)

(Additional note: the Internet Screensaver does not necessarily
have to be a separate piece of software from a Web browser. A
Web browser could itself be a screensaver, through the addition
of screensaver-related capabilities such as the ability to sense
user inactivity, the ability to bring itself into the foreground
when user inactivity is sensed, and the ability to completely
take over the screen so that only the desired screensaver
content is visible [usual menus, etc. would be hidden].
Screensaver content is usually, but not exclusively, a dark
screen containing moving images. Such a Web browser could use
its regular graphics abilities to display screensaver content in
the form of HTML, Java, JavaScript, or other protocols.)

Using bookmark files already stored on disk by popular Web
browsers.

Bookmark files contain a form of tracking information. They list
the sites the user visited and liked enough to want to be able
to easily visit again. Also, Netscape's bookmarks file, for
instance, contains the dates that the user created the bookmark
for each site as well as the date the user last visited the
site. These could be used to make inferences about how useful
the user finds the site; for instance, if he bookmarked a site a
long time ago and visited it very recently, it's fairly likely
that it's one of his more frequently-accessed sites.
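
The bookmark inference described above can be sketched as follows. This is only an illustrative sketch: it assumes the bookmarks file is HTML whose links carry ADD_DATE and LAST_VISIT attributes holding timestamps in seconds, and the names BookmarkParser, parse_bookmarks and likely_frequent, as well as the 180-day and 7-day thresholds, are hypothetical choices rather than anything specified in this document.

```python
from html.parser import HTMLParser


class BookmarkParser(HTMLParser):
    """Collects URL, creation time and last-visit time from bookmark links."""

    def __init__(self):
        super().__init__()
        self.bookmarks = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        fields = dict(attrs)  # HTMLParser lowercases attribute names
        if "href" not in fields:
            return
        self.bookmarks.append({
            "url": fields["href"],
            # Assumed attribute names; adjust to the actual file layout.
            "added": _to_int(fields.get("add_date")),
            "last_visit": _to_int(fields.get("last_visit")),
        })


def _to_int(value):
    """Return the attribute as an integer timestamp, or None if unusable."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def parse_bookmarks(text):
    """Parse bookmark-file text into a list of bookmark records."""
    parser = BookmarkParser()
    parser.feed(text)
    return parser.bookmarks


def likely_frequent(bookmark, now, old_days=180, recent_days=7):
    """Heuristic: bookmarked long ago and visited very recently."""
    if bookmark["added"] is None or bookmark["last_visit"] is None:
        return False
    day = 86400  # seconds per day
    return (now - bookmark["added"] >= old_days * day
            and now - bookmark["last_visit"] <= recent_days * day)
```

A record that passes likely_frequent could be given extra weight when building the user's profile; a record with missing dates is simply treated as uninformative.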
The WebHound Web site (now called WebHunter) which was produced
by the MIT Media Lab does, in fact, use bookmark files to
facilitate recommendation of Web sites by means of automated
collaborative filtering.

The problem with this technique is that there is currently no
automated way that a Web site can acquire the content of a
user's bookmark file.

A Java applet would be a candidate, except that security
restrictions currently prohibit Java from reading files on the
user's hard disk. Lifting this restriction in the case of
bookmark files would solve this problem.

Another way to track users is the following:

Code can be provided to a number of Web sites that enables them
all to access the same central database when a user logs in,
enabling the user to use the same user ID and password on many
different Web sites and potentially freeing each Web site from
the need to have a database for checking whether each user had
already registered. This code can update a central database to
show which participating sites have been accessed by each user.

Ease of implementation.

It would be valuable for embodiments of this invention to make
it very easy for Web sites to participate.

To do this, a Web site (or, perhaps, a page or set of pages)
should be made available that contains complete instructions on
how to set up a participating page. Instructions should explain
how to place a Smart Ad Box on a page, as well as how to enable
the tracking of users on a page (if the embodiment involves
separate code for tracking and for the Smart Ad Box).

The code could be designed in such a way that there need be no
direct communication between the people supplying the Smart Ad
Box service and related services and the people who want to
enable their Web site to participate in those services. Any
competent practitioner could design such code. Furthermore, it
should be designed in such a way that the modifications required
to enable a Web page to participate are minimal. Again, any
competent practitioner could design such code.

Thus, an instructional site would enable participation in the
service to grow rapidly. Web sites could very easily become
participants on a trial basis.

It is an important consequence of this invention that relatively
small Web sites (small in the sense of a relatively small number
of daily visitors) will be able to become participants. Because
no human involvement is required on the part of the company
supplying the Smart Ad Box and related services, there is much
less of a barrier to the involvement of these small sites in
advertising. Normally, the manpower associated with making
agreements between individual advertisers and ad agencies and
individual Web sites is prohibitive enough that no such
agreements are made with small sites. Thus, this invention will
enable many small sites to earn money from displaying
advertising. The largest expense involved in dealing with an
individual participating Web site might be the expense of
writing and mailing a check; of course, Internet banking may
soon lower that cost.

Optionally, the Web site discussed in this section (or a
separate site) could allow participating Web sites to determine
the amount of money they have earned to date by virtue of their
participation. (It is expected that advertisers will pay the
company offering Smart Ad Box and related services, and that
this company will pay the participating Web sites.)

For instance, each participating site could have an identifying
code and/or password which they acquire through interaction with
the instructional Web site or some other Web site. The
participating sites themselves could choose their ID codes
and/or passwords, or they could be assigned by the software.

(It is possible that this same ID code could be "hard-coded," or
directly incorporated, into the code on a participating Web page
which calls the Smart Ad Box CGI in order to identify the site.
The instructional site would, of course, explain how to do
this.)

In an embodiment where advertisers have ID codes and/or
passwords, they should be able to go to a particular page and
type in that information, and, in return, be enabled to see the
amount of money they have earned to date.

(Alternatively, instead of using ID codes and passwords, an
embodiment could identify companies by other means, such as
checking a cookie on the client machine or simply allowing
companies to type in the company name.)

Moreover, in one embodiment, Web sites that would like to become
paid participants would be able to accomplish everything needed
online, without manual intervention. This would save
considerable money, and make it even more practical to allow
small Web sites to participate.

In other words, there would be a Web page with (possibly among
others) the following attributes:

Prospective participants could input (or receive a generated) ID
code and a password. Of course, the system would check to make
sure that this ID code was not already in use by some other
company.

Prospective participants could input whatever information is
required for payment; for instance, if physical checks are to be
sent through the mail, this information would include the
address and the name to make the check out to.

Then, based on the above information, the system could
automatically cause payment to be made. Checks could be printed,
or funds transmitted by electronic means, all with no (or
minimal) human intervention.

In order to keep expenses down, the system could optionally be
programmed not to send a payment until the money owed to the
participating site exceeded some preset amount. This way, the
expense of sending the payment will only be a small percentage
of the funds involved.

It must be stressed that there are other ways of enabling
customers to input the information discussed in this section.
For instance, multiple Web pages could be involved in the input
process, or, as just one more example, if the embodiment in
question involved telephone communications, part or all of the
input process could occur by means of pressing the keys on a
touch-tone telephone. (Of course, Web page input is based on
hardware such as a video screen, keyboard, random access memory,
etc. The techniques described here are a method for enabling
this hardware to achieve the desired ends.)

It should also be noted that the techniques described in this
section are also useful for advertising systems that do not
involve automated collaborative filtering; as one example,
consider a system that simply uses demographic information
supplied by the individual users in order to decide which ads to
display to whom. Such a system could use the techniques
described here to enable Web sites to participate without human
intervention, again leading to the cost savings which would make
it very practical to allow very small Web sites to participate.

It is of significant value to enable these small Web sites to
participate, because a large amount of the time of many who use
the Web is spent visiting such small sites. Making that space
available for advertising adds significantly to the potential
revenue stream.
Automated Collaborative Filtering

ACF is a field of research which has been receiving attention in
recent years from such organizations as BellCore and the MIT
Media Lab. I myself devoted 18 months over the last few years to
research in this area. An MIT Media Lab spinoff company called
Agents, Inc. has a Web site which uses this technology for
making personalized recommendations of music CD's. Upendra
Shardanand's 1994 Massachusetts Institute of Technology (MIT)
Media Lab Master's Degree thesis, entitled Social Information
Filtering For Music Recommendation (hereby incorporated by
reference) is a good write-up of their basic technology.

The basic idea, as applied to Smart Ad Boxes, is as follows.

Suppose we want to decide whether a particular ad is likely to
be of interest to a particular user, say, Joe. We want to use
automated collaborative filtering; we may be using this
technique in addition to other methods for matching ads to
users.

First, we need to decide which other users are similar to Joe in
their interests. A list of similar users can be stored in the
database, or can be generated "on the fly." Ideally, we would
also compute a number representing the degree of likely
similarity of interests. The order of the list can be based on
this number: for example, the most similar person to Joe is at
the top of the list, and each successive entry displays less
similarity until some cutoff point is reached, beyond which
people aren't added to the list.

To compute these degrees of similarity, we need data. This data
will involve the information we have stored in our database by
tracking each user over time. (Optionally, it could also involve
other information such as demographic data supplied by the user;
but by not relying on such data, we eliminate the need for the
user to actively participate in this process in any way.)

From our database, the system "knows" which Web sites Joe has
visited, and, possibly, how often he has visited each one, the
amount of time spent at each one, which ads he clicked on or
rejected, and/or other information. We will have collected
similar information with respect to other users.

In order to compute a degree of similarity of interests between
Joe and a particular user, we compare the stored tracking
information. In some cases, Joe and the other user will have
visited none or very few of the same sites. In other cases, the
tracking information will show that they are remarkably similar
in the sites they've chosen to visit; this would be indicative
of a high degree of similarity of interests.

Certain mathematical and statistical techniques can be used to
compute a number which represents the amount of likely
similarity of interests in a meaningful way, based on such
profiles. Such techniques are described in the Shardanand thesis
and John Hey's U.S. Pat. Nos. 4,870,579 and 4,996,642 (hereby
incorporated by reference). While these techniques are usually
described as being useful for deciding which pairs of people
tend towards the most similar esthetic judgments, the techniques
apply equally well to their basic interests in life, as
manifested in, for instance, the types of Web sites they choose
to visit and the types of ads they click on.

Now, let's assume that we have a list of the users who are most
likely to be similar to Joe in their interests, and, optionally,
that we have a number corresponding to each other user and
describing the degree of similarity between him and Joe.

There are a number of ways we can use this list of similar
users, some of which are described below.

If the current embodiment is one in which users can (perhaps
optionally) provide demographic information, then some users who
are similar to Joe will, most likely, have supplied such
information. Since they have similar interests to Joe, there is
a probability that their demographic backgrounds will also be
similar to Joe's.

The software can therefore make intelligent guesses about Joe's
demographic data. For instance, with regard to age, the software
can compute the average age of the people close to Joe who have
supplied us with their ages. The same idea holds for income
level. The software can guess items such as sex by extrapolating
from the most common sex of people with similar interests to
Joe. Similarly, the software can make intelligent guesses about
other categories of demographic data. The specific technique
used to make the extrapolation isn't the concern here. The point
is that an extrapolation can easily be made.

Thus, if the advertiser has given information which is stored in
the system about what the target audience for an ad is, then the
system can check to see which ads are most well suited for Joe.
However, even if the advertiser hasn't, the system can examine
the data to see which demographic groups have showed the most
interest in each ad. The system can thus supply that information
if the advertiser doesn't.

One interesting aspect of this technique occurs if Joe's
interests are atypical for his demographic group. For instance,
some people in their 50's have interests that are more common
for people in their 30's. Thus, the technique described here may
incorrectly come to the conclusion that Joe is in his 30's when
he's really in his 50's; but that erroneous conclusion would
actually lead, in this case, to a better targeting of
advertisements. If Joe's interests are closer to those of a
person in his 30's, then ads directed to that age group are the
ads that are most likely to be of interest to him.

Of course, this also applies to users who have supplied some,
but not all, of any requested demographic data. This should not
be construed to mean that my invention ONLY involves making
extrapolation about demographics by means of such ACF techniques
as were explicitly mentioned in the documents, such as John
Hey's and the MIT Media Lab's (Shardanand's). It also includes
making extrapolations based on other versions of ACF, some of
which may have very different degrees of sophistication from the
mentioned ones.

Whether or not the users supply demographics, there are a number
of possible "pure ACF" approaches, some, but not all, of which
will be discussed here.

One approach:

For every ad, we can consider the list of people who are similar
to Joe, and compute the ratio of clicks to impressions. For
example, if there were a total of 1000 impressions, and 10
people clicked on the ad, the ratio would be 10/1000 or 1/100.
(An impression is one showing of an ad to a person.)

This ratio provides a very rough measure of the interest of
people on the list in the ad. The greater the ratio, the more
interest is indicated.

Therefore, the ad with the highest ratio would be considered to
be the one most likely to be of interest to Joe; the ad with the
second highest ratio would be the one second most likely to be
of interest; etc.

These ratios could thus determine the frequency with which the
system chooses to show Joe the various ads.
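
As a minimal sketch of the click-to-impression ratio approach, the following assumes that the tracking data is available as one (user, ad, clicked) record per impression and that the set of users similar to Joe has already been computed. The function name and the record layout are hypothetical.

```python
from collections import defaultdict


def rank_ads_by_click_ratio(events, similar_users):
    """Rank ads by clicks/impressions among users similar to the subject.

    events: iterable of (user_id, ad_id, clicked) tuples, one per impression.
    similar_users: set of user ids judged similar to the subject.
    Returns a list of (ad_id, ratio) pairs, highest ratio first.
    """
    impressions = defaultdict(int)
    clicks = defaultdict(int)
    for user_id, ad_id, clicked in events:
        if user_id not in similar_users:
            continue  # only people on the similarity list count
        impressions[ad_id] += 1
        if clicked:
            clicks[ad_id] += 1
    ratios = {ad: clicks[ad] / impressions[ad] for ad in impressions}
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)
```

The resulting ratios could be used directly as relative display frequencies, or only for their ordering.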
Another approach:

For each person on the list of people similar to Joe, we see 
how many impressions of a particular ad were required 
before he clicked on it. We assume that that number is 
related to the probability he had of clicking on an ad during
a given impression. For instance, if he had 10 impressions 
before clicking on the ad, we might assume that P≈1/10. If
he never clicked on the ad, we would assume P=0. 

So, for each user, and for each ad, we will have a value 
for P. If we average these values together for each ad, we'll 
have an average P for that ad, and can use that to determine 
which ads are more likely to be of interest to Joe. 
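
A small sketch of this per-user estimate and its simple average follows. It assumes that, for one ad, we know for each similar user how many impressions occurred up to and including the first click, with None standing for "never clicked"; the function names are hypothetical.

```python
def click_probability(impressions_until_click):
    """Estimate P for one user and one ad.

    impressions_until_click: number of impressions it took before the user
    clicked, or None if the user never clicked (P is then taken to be 0).
    """
    if impressions_until_click is None:
        return 0.0
    return 1.0 / impressions_until_click


def average_p(impressions_by_user):
    """Simple (unweighted) average of P over the similar users for one ad."""
    values = [click_probability(n) for n in impressions_by_user.values()]
    return sum(values) / len(values) if values else 0.0
```

Computing average_p once per ad gives one number per ad, and the ads can then be ordered by that number.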

Now, we showed earlier that it is possible to compute a 
numerical measure of closeness to Joe. These measures 
could be used as weights. Instead of doing a simple average, 
we can take the weighted average of all the users, where the
weights are determined by each user's closeness to Joe. 

One could use the closeness measures directly as weights. 
Alternatively, they could be transformed. Shardanand's the- 
sis gives some methods for transforming similarities to 
weights that have proven to be effective.
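
The weighted average can be sketched as below, here using the closeness measures directly as weights. The transform argument is a hypothetical hook for the kind of similarity-to-weight transformation mentioned above; by default it leaves the closeness values unchanged.

```python
def weighted_average_p(p_by_user, closeness_by_user, transform=lambda s: s):
    """Weighted average of per-user P values for one ad.

    p_by_user: {user_id: estimated P for this ad}
    closeness_by_user: {user_id: closeness of that user to the subject}
    transform: optional function turning a closeness value into a weight.
    """
    weights = {u: transform(closeness_by_user[u]) for u in p_by_user}
    total = sum(weights.values())
    if total == 0:
        return 0.0  # no usable weights, so no evidence either way
    return sum(p_by_user[u] * weights[u] for u in p_by_user) / total
```

With equal closeness values this reduces to the simple average; a user three times as close as another contributes three times as much to the result.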

Genetic programming could be used to evolve algorithms to
transform the similarities to weights. The fitness function 
would be the algorithm's success in predicting which ads are 
of interest to Joe. For purposes of the genetic programming 
process, the fitness function would measure how good a
particular algorithm is at "predicting" how interested Joe 
was in ads that he has already been exposed to, and where 
we have already counted how many impressions it took him 
to click on them. 
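
One possible shape for such a fitness function is sketched below. It is an assumption-laden illustration, not a prescribed method: it scores a candidate similarity-to-weight transform by the negative mean squared error between the weighted-average prediction and Joe's own observed P on ads he has already seen. The error measure and all names are hypothetical.

```python
def fitness(transform, history, joe_observed):
    """Score a candidate similarity-to-weight transform (higher is better).

    transform: candidate function mapping a similarity value to a weight.
    history: {ad_id: [(similarity_to_joe, p_of_that_user), ...]}
    joe_observed: {ad_id: Joe's own P, e.g. 1/impressions-until-click}
    Returns the negative mean squared prediction error.
    """
    errors = []
    for ad_id, observed in joe_observed.items():
        pairs = history.get(ad_id, [])
        weights = [transform(similarity) for similarity, _ in pairs]
        total = sum(weights)
        if total <= 0:
            continue  # this candidate yields no prediction for the ad
        predicted = sum(w * p for w, (_, p) in zip(weights, pairs)) / total
        errors.append((predicted - observed) ** 2)
    if not errors:
        return float("-inf")  # a candidate that predicts nothing is worst
    return -sum(errors) / len(errors)
```

A genetic programming run would evaluate each evolved transform with a function of this kind and keep the higher-scoring candidates.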

Other methods for determining weights can be generated
by trial and error or by other means. 

A more statistically sophisticated approach: 

Again, we only consider the people who are similar to Joe 
in taste. 

We will again compute the ratio of clicks to impressions. 
Call it R.

For each ad, there is a probability P that people as similar 
in taste to Joe as those on the list will click on it in a given 
exposure. 

P, not R, is the number we're really interested in. R is just 
a rough estimate of P. The reason for this is that we only have
a limited amount of data available to us; if we had an 
unlimited amount of data, R would be the same as P. 

We can't compute P exactly, but we can find P_A such that
we can reject the null hypothesis that P=P_A with a confi-
dence level of A, which might typically be 0.05. (In other
words, there would only be a 5% chance that P≤P_A.) Thus,
P_A, as opposed to R, is a number that we can have a known
degree of confidence in.

There are a number of different approaches for computing
P_A. One of these uses the cumulative binomial distribution
together with a search algorithm. In one embodiment, the
search algorithm successively tests possible values for P_A at
certain fixed intervals, for instance, 0.01, 0.02, 0.03, . . . ,
0.99 until it finds the greatest P_A such that we can reject the
null hypothesis that P=P_A with confidence level A. In
another embodiment, a binary search mechanism is used to
accomplish the same goal more quickly and/or accurately.
We'll refer to the chosen search algorithm as S( ). Let C( )
be the cumulative binomial probability distribution function.
[One is described in Press, Teukolsky, Vetterling, and Flan-
nery 1992, Numerical Recipes in C, 2nd Ed, (Cambridge,
England: The Cambridge University Press) p. 229; the
relevant sections are hereby incorporated by reference.]

The inputs to C( ) will be: 

1. An assumed value for P; these assumed values are
generated by S( ) in order to see how consistent or 
inconsistent they are with the evidence. 
2. The number of impressions of the ad (that is, the
number of times people were shown the ad). From a 
statistical point of view, each impression is considered 
to be an experiment. 

3. The number of clicks on the ad in question. From a 
statistical point of view, each click is considered to be 
a success in the experiment. 

C( ) will calculate the probability that the presented 
combination of (2) and (3) (or a greater number of 
successes) could have occurred by chance alone given the 
assumption of (1). If this is a low probability, it's unlikely 
that the assumption is correct. 

S( ) will repeatedly call C( ) with different trial values for
P, until it finds one (P_A) such that the calculated prob-
ability is A. If we have chosen a low A, then this would
imply that we can confidently reject the hypothesis that
P=P_A.

Then a relatively high value of P_A for a particular ad will
mean that that is an ad we can be confident that Joe will be
interested in. So P_A can determine the order in which we
present the ads; they are presented in descending order of P_A,
highest first.
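
The roles of C( ) and S( ) can be sketched as follows. This is an illustrative sketch under stated assumptions: upper_tail plays the part of C( ) and is computed by direct summation of binomial terms in log space rather than by the routine in the cited reference, and grid_search_p_a and binary_search_p_a are two hypothetical versions of S( ). All function names and default parameters are assumptions.

```python
import math


def upper_tail(p, n, k):
    """C( ): probability of k or more clicks in n impressions if the true
    click probability were p."""
    if k <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        # log of the binomial term C(n, i) * p**i * (1 - p)**(n - i)
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_term)
    return min(total, 1.0)


def grid_search_p_a(n, k, alpha=0.05, step=0.01):
    """S( ), fixed-interval version: greatest grid value that is rejected."""
    best = 0.0
    for j in range(1, int(round(1 / step))):
        p = j * step
        if upper_tail(p, n, k) <= alpha:
            best = p   # still rejected at level alpha, keep going upward
        else:
            break      # the tail probability only grows with p
    return best


def binary_search_p_a(n, k, alpha=0.05, tol=1e-6):
    """S( ), binary-search version of the same goal."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if upper_tail(mid, n, k) <= alpha:
            lo = mid
        else:
            hi = mid
    return lo
```

For the earlier example of 10 clicks in 1000 impressions, the grid version with a step of 0.01 is too coarse to separate P_A from zero, which illustrates why the binary search can be the more accurate of the two embodiments.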

Still another approach: 

An approximate value could be found for P_A by means of
a neural network. (The neural network could be trained
using P_A as computed above.) This would have advantages
over the previous approach in execution speed.

Alternatively, genetic programming could be used to
evolve an algorithm that outputs an approximation to P_A.

The results of the previous two approaches could be 
computationally combined. 

For instance, if the system orders the ads by using 
demographic data and also orders the ads using a "pure 
ACF" approach, such that the most appropriate ads for a 
given user come first, then the two approaches could easily 
be combined by calculating the average position for each ad. 
In other words, the ad might be the nth ad in the 
demographic-based ranking and the mth ad in the "pure 
ACF" ranking; the combined rank could be (n+m)/2. 
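
The average-position combination can be sketched as below. The function name is hypothetical, and the handling of ties (broken by ad identifier) and of ads missing from one ranking (dropped) are assumptions made only so that the example is well defined.

```python
def combine_rankings(demographic_order, acf_order):
    """Combine two ad rankings by average position, (n + m) / 2.

    Each argument is a list of ad ids, most appropriate first.
    Returns the ad ids ordered by combined rank, best first.
    """
    demo_pos = {ad: i for i, ad in enumerate(demographic_order, start=1)}
    acf_pos = {ad: i for i, ad in enumerate(acf_order, start=1)}
    combined = {ad: (demo_pos[ad] + acf_pos[ad]) / 2
                for ad in demo_pos if ad in acf_pos}
    return sorted(combined, key=lambda ad: (combined[ad], ad))
```

For example, an ad ranked first demographically and second by "pure ACF" gets a combined rank of 1.5.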

There are an infinite number of other computational ways
to combine the two techniques, very many of which could be
constructed by any competent practitioner.

In combination with such approaches as are described in 
this section, cluster analysis can be used. 

Instead of comparing Joe to each individual user, we can 
compare Joe to clusters of similar users. These clusters will 
be comprised of individuals with similar demographics 
and/or tracking histories. The degree of similarity between 
people will be computed as described above; in each cluster, 
each individual will be more similar to people in his own 
cluster than to people in other clusters. 

Such an approach can be more computationally efficient, 
since Joe would only need to see which cluster he is 
associated with, rather than comparing himself to all (or a 
substantial subset) of the set of individual users. 

Most ads will be more of interest to people in some 
clusters than others; this can be determined by techniques 
such as those described above, but applying those compu-
tations (such as the cumulative binomial distribution) to
clusters rather than to individuals. 
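
A minimal sketch of the cluster lookup follows. It assumes each cluster has already been summarized as a representative set of visited sites, and it uses Jaccard overlap as a stand-in similarity measure; neither the summary format nor the measure is specified in this document, and the names are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of visited sites, between 0 and 1."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def nearest_cluster(subject_sites, cluster_profiles):
    """Return the id of the cluster whose profile best matches the subject.

    subject_sites: set of sites the subject has visited.
    cluster_profiles: {cluster_id: representative set of sites}.
    """
    return max(cluster_profiles,
               key=lambda cid: jaccard(subject_sites, cluster_profiles[cid]))
```

Only one comparison per cluster is needed, rather than one per individual user, which is the source of the efficiency gain described above.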

One other aspect of using ACF to decide which ads to
show to which users should be noted here. The system has
to collect data on a number of users which shows whether or
not they responded to particular ads. Then, when, for
instance, the system needs to compute the priority with
which we should consider showing a particular ad to Joe, it
finds users whose profiles are similar to Joe's and for whom it
knows whether or not they responded to the ad.



